NSF PAR Search | NSF Public Access Repository

Note: Clicking a Digital Object Identifier (DOI) number takes you to an external site maintained by the publisher. Some full-text articles may not yet be available free of charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Integration of 168,000 samples reveals global patterns of the human gut microbiome

https://doi.org/10.1016/j.cell.2024.12.017

Abdill, Richard J; Graham, Samantha P; Rubinetti, Vincent; Ahmadian, Mansooreh; Hicks, Parker; Chetty, Ashwin; McDonald, Daniel; Ferretti, Pamela; Gibbons, Elizabeth; Rossi, Marco; et al (February 2025, Cell)

Free, publicly accessible full text available February 1, 2026

Edge-Informed Estimation of Gaussian Point Spread Functions in Convolutional Blurring Models

https://doi.org/10.1109/CISA60639.2024.10576511

Hume, Jacob; McDonald, Daniel; Newman, Allan; Liveoak, Donald; Hristova, Yulia; Viswanathan, Aditya (May 2024, IEEE)

The underlying physics of imaging processes and associated instrumentation limitations mean that blurring artifacts are unavoidable in many applications such as astronomy, microscopy, radar and medical imaging. In several such imaging modalities, convolutional models are used to describe the blurring process; the observed image or function is a convolution of the true underlying image and a point spread function (PSF) which characterizes the blurring artifact. In this work, we propose and analyze a technique, based on convolutional edge detectors and Gaussian curve fitting, to approximate unknown Gaussian PSFs when the underlying true function is piecewise-smooth. For certain simple families of such functions, we show that this approximation is exponentially accurate. We also provide preliminary two-dimensional extensions of this technique. These findings, confirmed by numerical simulations, demonstrate the feasibility of recovering accurate approximations to the blurring function, which serves as an important prerequisite to solving deblurring problems.
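The general idea in this abstract can be illustrated with a toy one-dimensional sketch. It is not the authors' algorithm: it assumes a single unit jump, noise-free data, a centered finite difference as the edge detector, and a log-parabola least-squares fit as the Gaussian curve fit. The function names `make_blurred_step` and `estimate_gaussian_sigma` are hypothetical.

```python
import math
import numpy as np

def make_blurred_step(x, sigma):
    """Unit jump at x = 0 convolved with a Gaussian PSF of width sigma (closed form)."""
    return np.array([0.5 * (1.0 + math.erf(v / (sigma * math.sqrt(2.0)))) for v in x])

def estimate_gaussian_sigma(x, blurred):
    """Estimate the PSF width from a blurred signal that contains one jump."""
    dx = x[1] - x[0]
    # Edge detector (assumed here: finite differences). Differentiating a
    # blurred jump yields a Gaussian bump with the same width as the PSF.
    edge_response = np.abs(np.gradient(blurred, dx))
    # Keep only samples near the peak so the logarithm is well defined.
    mask = edge_response > 0.05 * edge_response.max()
    # The log of a Gaussian is a parabola a*x^2 + b*x + c with a = -1 / (2 sigma^2).
    a, _b, _c = np.polyfit(x[mask], np.log(edge_response[mask]), 2)
    return math.sqrt(-1.0 / (2.0 * a))

x = np.linspace(-1.0, 1.0, 1001)
sigma_hat = estimate_gaussian_sigma(x, make_blurred_step(x, 0.1))
print(f"estimated sigma: {sigma_hat:.4f}")
```

With noisy data or several nearby jumps, the differentiation step would need smoothing and the fit would need to be restricted to one edge at a time; the sketch ignores both.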
Full Text Available

Generation of accurate, expandable phylogenomic trees with uDance

https://doi.org/10.1038/s41587-023-01868-8

Balaban, Metin; Jiang, Yueyu; Zhu, Qiyun; McDonald, Daniel; Knight, Rob; Mirarab, Siavash (May 2024, Nature Biotechnology)

Phylogenetic trees provide a framework for organizing evolutionary histories across the tree of life and aid downstream comparative analyses such as metagenomic identification. Methods that rely on single-marker genes such as 16S rRNA have produced trees of limited accuracy with hundreds of thousands of organisms, whereas methods that use genome-wide data are not scalable to large numbers of genomes. We introduce updating trees using divide-and-conquer (uDance), a method that enables updatable genome-wide inference using a divide-and-conquer strategy that refines different parts of the tree independently and can build off of existing trees, with high accuracy and scalability. With uDance, we infer a species tree of roughly 200,000 genomes using 387 marker genes, totaling 42.5 billion amino acid residues.
Full Text Available

Optimizing UniFrac with OpenACC Yields Greater Than One Thousand Times Speed Increase

https://doi.org/10.1128/msystems.00028-22

Sfiligoi, Igor; Armstrong, George; Gonzalez, Antonio; McDonald, Daniel; Knight, Rob (June 2022, mSystems)
Greene, Casey S. (Ed.)
ABSTRACT: UniFrac is an important tool in microbiome research that is used for phylogenetically comparing microbiome profiles to one another (beta diversity). Striped UniFrac recently added the ability to split the problem into many independent subproblems, exhibiting nearly linear scaling but suffering from memory contention. Here, we adapt UniFrac to graphics processing units using OpenACC, enabling greater than 1,000× computational improvement, and apply it to 307,237 samples, the largest 16S rRNA V4 uniformly preprocessed microbiome data set analyzed to date.

IMPORTANCE: UniFrac is an important tool in microbiome research that is used for phylogenetically comparing microbiome profiles to one another. Here, we adapt UniFrac to operate on graphics processing units, enabling a 1,000× computational improvement. To highlight this advance, we perform what may be the largest microbiome analysis to date, applying UniFrac to 307,237 16S rRNA V4 microbiome samples preprocessed with Deblur. These scaling improvements turn UniFrac into a real-time tool for common data sets and unlock new research questions as more microbiome data are collected.
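For readers unfamiliar with the metric, the following toy sketch computes the standard unweighted UniFrac distance between two samples: the branch length leading only to taxa of one sample, divided by the branch length leading to taxa of either sample. It is a plain-Python illustration, not the striped or OpenACC implementation described in the abstract. The tree encoding (a list of `(length, tips_below)` pairs) and the names `TOY_BRANCHES` and `unweighted_unifrac` are assumptions made for this example.

```python
# Toy tree ((t1:1, t2:1):1, (t3:1, t4:1):1), encoded as one entry per branch:
# (branch length, set of tips that descend from that branch).
TOY_BRANCHES = [
    (1.0, {"t1"}),
    (1.0, {"t2"}),
    (1.0, {"t1", "t2"}),
    (1.0, {"t3"}),
    (1.0, {"t4"}),
    (1.0, {"t3", "t4"}),
]

def unweighted_unifrac(branches, sample_a, sample_b):
    """Fraction of observed branch length that leads to only one of the two samples."""
    unique_length = 0.0
    observed_length = 0.0
    for length, tips_below in branches:
        in_a = not tips_below.isdisjoint(sample_a)
        in_b = not tips_below.isdisjoint(sample_b)
        if in_a or in_b:
            observed_length += length
            if in_a != in_b:
                # Branch is present in exactly one sample.
                unique_length += length
    # Two empty samples share nothing to compare; report zero distance.
    return unique_length / observed_length if observed_length else 0.0

distance = unweighted_unifrac(TOY_BRANCHES, {"t1", "t2"}, {"t2", "t3"})
print(f"unweighted UniFrac: {distance:.2f}")
```

Each pair of samples is scored independently of every other pair, which is the property that makes an all-against-all distance matrix amenable to splitting into subproblems and to parallel hardware.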
Full Text Available

Host biology, ecology and the environment influence microbial biomass and diversity in 101 marine fish species

https://doi.org/10.1038/s41467-022-34557-2

Minich, Jeremiah J.; Härer, Andreas; Vechinski, Joseph; Frable, Benjamin W.; Skelton, Zachary R.; Kunselman, Emily; Shane, Michael A.; Perry, Daniela S.; Gonzalez, Antonio; McDonald, Daniel; et al (December 2022, Nature Communications)

Fish are the most diverse and widely distributed vertebrates, yet little is known about the microbial ecology of fishes or the biological and environmental factors that influence fish microbiota. To identify factors that explain microbial diversity patterns in a geographical subset of marine fish, we analyzed the microbiota (gill tissue, skin mucus, midgut digesta and hindgut digesta) from 101 species of Southern California marine fishes, spanning 22 orders, 55 families and 83 genera, representing ~25% of local marine fish diversity. We compare alpha, beta and gamma diversity while establishing a method to estimate microbial biomass associated with these host surfaces. We show that body site is the strongest driver of microbial diversity while microbial biomass and diversity are lowest in the gill of larger, pelagic fishes. Patterns of phylosymbiosis are observed across the gill, skin and hindgut. In a quantitative synthesis of vertebrate hindguts (569 species), we also show that mammals have the highest gamma diversity when controlling for host species number while fishes have the highest percent of unique microbial taxa. The composite dataset will be useful to vertebrate microbiota researchers and fish biologists interested in microbial ecology, with applications in aquaculture and fisheries management.
Full Text Available

Greengenes2 unifies microbial data in a single reference tree

https://doi.org/10.1038/s41587-023-01845-1

McDonald, Daniel; Jiang, Yueyu; Balaban, Metin; Cantrell, Kalen; Zhu, Qiyun; Gonzalez, Antonio; Morton, James T.; Nicolaou, Giorgia; Parks, Donovan H.; Karst, Søren M.; et al (July 2023, Nature Biotechnology)

Studies using 16S rRNA and shotgun metagenomics typically yield different results, usually attributed to PCR amplification biases. We introduce Greengenes2, a reference tree that unifies genomic and 16S rRNA databases in a consistent, integrated resource. By inserting sequences into a whole-genome phylogeny, we show that 16S rRNA and shotgun metagenomic data generated from the same samples agree in principal coordinates space, taxonomy and phenotype effect size when analyzed with the same tree.

Porting and optimizing UniFrac for GPUs: Reducing microbiome analysis runtimes by orders of magnitude

https://doi.org/10.1145/3311790.3399614

Sfiligoi, Igor; McDonald, Daniel; Knight, Rob (July 2020, Practice and Experience in Advanced Research Computing (PEARC20))
Full Text Available

STREAMS guidelines: standards for technical reporting in environmental and host-associated microbiome studies

https://doi.org/10.1038/s41564-025-02186-2

Kelliher, Julia M; Mirzayi, Chloe; Bordenstein, Sarah R; Oliver, Aaron; Kellogg, Christina A; Hatcher, Eneida L; Berg, Maureen; Baldrian, Petr; Aljumaah, Mashael; Miller, Cassandra Maria Luz; et al (December 2025, Nature Microbiology)

Free, publicly accessible full text available December 1, 2026

Compressed and Penalized Linear Regression

https://doi.org/10.1080/10618600.2019.1660179

Homrighausen, Darren; McDonald, Daniel J. (April 2020, Journal of Computational and Graphical Statistics)

Full Text Available

Algorithms for Estimating Trends in Global Temperature Volatility

https://doi.org/10.1609/aaai.v33i01.3301614

Khodadadi, Arash; McDonald, Daniel J. (July 2019, Proceedings of the AAAI Conference on Artificial Intelligence)

Trends in terrestrial temperature variability are perhaps more relevant for species viability than trends in mean temperature. In this paper, we develop methodology for estimating such trends using multi-resolution climate data from polar orbiting weather satellites. We derive two novel algorithms for computation that are tailored for dense, gridded observations over both space and time. We evaluate our methods with a simulation that mimics these data’s features and on a large, publicly available, global temperature dataset with the eventual goal of tracking trends in cloud reflectance temperature variability.
Full Text Available

